#DeepLearning #Fine-Tuning #PEFT
Parameter-Efficient Fine-Tuning
Method
- Modify a small part of the pretrained model's parameters
- Add extra parameters
- Add an increment to the pretrained model's parameters
- LoRA, FACT
Modify a small part of the pretrained model's parameters
Bias-term Fine-tuning (BitFit): fine-tune the bias terms
Only modify (retrain) a small subset of the pretrained model's parameters
- The bias terms and the final linear layer
[!success]+ Pros
- Simple yet effective; performance is comparable to full-model fine-tuning
- Downstream tasks can be learned sequentially, which makes deployment efficient
- Only a very small number of parameters needs to be stored per downstream task
- For a BERT backbone, only about 0.1% of the parameters are modified (retrained)
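A minimal PyTorch sketch of the BitFit idea (the toy model here is a stand-in for a pretrained backbone, not the method's actual architecture): freeze every parameter except the bias terms, so only a tiny fraction of weights receives gradients.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained backbone (illustrative only).
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

# BitFit: only bias vectors stay trainable; all weight matrices are frozen.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} / {total}")  # 36 / 676 here: biases only
```

In practice the task-specific head would also be left trainable; here even this toy model shows the trainable fraction collapsing to the bias dimensions.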
Add extra parameters
Adapter Tuning
+ Idea
Add an Adapter module to each layer of the pretrained model
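A sketch of the standard bottleneck adapter (shapes and the bottleneck size are illustrative): down-project, nonlinearity, up-project, plus a residual connection. Only these small matrices are trained while the backbone stays frozen.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter inserted after a Transformer sublayer.
    Residual form: x + up(relu(down(x)))."""
    def __init__(self, d_model: int, bottleneck: int = 8):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        # Near-identity init: with up == 0, the adapter passes x through
        # unchanged, so tuning starts from the pretrained behavior.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))

x = torch.randn(2, 5, 64)   # (batch, seq, d_model)
out = Adapter(64)(x)
print(out.shape)            # same shape as the input
```

The zero-initialized up-projection means a freshly inserted adapter is an identity map, which is why inserting it does not disturb the pretrained model before training.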
Prefix Tuning
+ Idea
Prepend learnable, task-specific prefix vectors (continuous "virtual tokens") to mimic prompts
Prompt Tuning
+ Idea
- Construct a (soft) prompt for each task and concatenate it with the input data before feeding it to the large model
- Adds prompt tokens only at the input layer, with no reparameterization MLP: a simplified version of Prefix Tuning
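The input-layer-only idea can be sketched as follows (dimensions and the initialization scale are illustrative): a single learnable embedding matrix is prepended to the frozen model's input embeddings, and it is the only trainable tensor.

```python
import torch
import torch.nn as nn

d_model, n_prompt, batch, seq = 64, 10, 2, 7

# The only trainable parameters: n_prompt soft-token embeddings.
soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)

# Stand-in for the frozen embedding layer's output on real input tokens.
input_embeds = torch.randn(batch, seq, d_model)

# Prepend the soft prompt along the sequence dimension.
batch_prompt = soft_prompt.unsqueeze(0).expand(batch, -1, -1)
full_input = torch.cat([batch_prompt, input_embeds], dim=1)
print(full_input.shape)  # (2, 17, 64): prompt tokens occupy input positions
```

Note the grown sequence length: the soft prompt consumes context positions, which is exactly the input-space cost raised in the LoRA section below.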
Add increment of parameters of pretrained models
LoRA
+ Cons of previous tuning methods
- Adapter Tuning inserts Adapter layers into the Transformer, which makes the model deeper and increases inference latency
- Prompt-based methods, such as Prefix Tuning and Prompt Tuning, are hard to optimize. In addition, the prompt tokens occupy part of the input sequence, reducing the number of tokens available for the actual input
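LoRA avoids both problems by learning a low-rank increment to a frozen weight matrix: h = xWᵀ + (α/r)·xAᵀBᵀ. No new layers are stacked (the update can be merged into W for inference) and no input positions are consumed. A minimal sketch, with illustrative rank and scaling values:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update.
    B is zero-initialized, so the adapted layer initially equals
    the pretrained one."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pretrained weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

base = nn.Linear(32, 32)
layer = LoRALinear(base)
x = torch.randn(4, 32)
# With B initialized to zero, the output matches the frozen base layer.
print(torch.allclose(layer(x), base(x)))
```

After training, the increment (α/r)·BA can be added into the base weight, so inference runs at exactly the original model's cost.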